Classifying Documents with Poisson Mixtures
Similar resources
Classifying Documents Without Labels
Automatic classification of documents is an important area of research with many applications in the fields of document searching, forensics and others. Methods to perform classification of text rely on the existence of a sample of documents whose class labels are known. However, in many situations, obtaining this sample may not be an easy (or even possible) task. Consider for instance, a set o...
Learning with Taxonomies: Classifying Documents and Words
Automatically extracting semantic information about word meaning and document topic from text typically involves an extensive number of classes. Such classes may represent predefined word senses, topics or document categories and are often organized in a taxonomy. The latter encodes important information, which should be exploited in learning classifiers from labeled training data. To that exte...
Classifying with Gaussian Mixtures and Clusters
In this paper, we derive classifiers which are winner-take-all (WTA) approximations to a Bayes classifier with Gaussian mixtures for class conditional densities. The derived classifiers include clustering based algorithms like LVQ and k-Means. We propose a constrained rank Gaussian mixtures model and derive a WTA algorithm for it. Our experiments with two speech classification tasks indicate th...
On Poisson–Tweedie mixtures
Poisson-Tweedie mixtures are the Poisson mixtures for which the mixing measure is generated by those members of the family of Tweedie distributions whose support is non-negative. This class of non-negative integer-valued ...
Poisson Mixtures
Shannon (1948) showed that a wide range of practical problems can be reduced to the problem of estimating probability distributions of words and n-grams in text. It has become standard practice in text compression, speech recognition, information retrieval and many other applications of Shannon's theory to introduce a "bag-of-words" assumption. But obviously, word rates vary from genre to genr...
Journal
Journal title: Transactions on Machine Learning and Artificial Intelligence
Year: 2014
ISSN: 2054-7390
DOI: 10.14738/tmlai.24.388